
United States Patent and Trademark Office 



UNITED STATES DEPARTMENT OF COMMERCE 
United States Patent and Trademark Office 
Address: COMMISSIONER FOR PATENTS
P.O. Box 1450
Alexandria, Virginia 22313-1450
www.uspto.gov 



APPLICATION NO.: 09/782,434
FILING DATE: 02/13/2001
FIRST NAMED INVENTOR: Martin Franz
ATTORNEY DOCKET NO.: YOR9-2001-0011US1 (8728-4
CONFIRMATION NO.: 9870

7590 06/09/2004

Frank Chau, Esq.
F. CHAU & ASSOCIATES, LLP
Suite 501
1900 Hempstead Turnpike
East Meadow, NY 11554

EXAMINER: JACKSON, JAKIEDA R
ART UNIT: 2655
PAPER NUMBER:

DATE MAILED: 06/09/2004 



Please find below and/or attached an Office communication concerning this application or proceeding. 



PTO-90C (Rev. 10/03) 



Office Action Summary 


Application No.: 09/782,434
Applicant(s): FRANZ ET AL
Examiner: Jakieda R Jackson
Art Unit: 2655





-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --
Period for Reply 



A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed

Application/Control Number: 09/782,434
Art Unit: 2655 

DETAILED ACTION 
Response to Amendment 

1. In response to the Office Action mailed January 8, 2004, applicant
submitted an Amendment filed on April 12, 2004, in which the applicant amended
claim 1 to include the limitations of claim 2 and canceled claim 2. Applicant has
requested reconsideration of the amended claim 1. Applicant also traverses
rejected claims 11 and 19.



Response to Arguments 

2. Applicant argues, regarding claim 2, that Gillick does not disclose "dividing
text data"; instead, Gillick discloses dividing the spoken utterance. Applicant also
argues, regarding claim 2, that Gillick does not anticipate "dividing text data for
training a plurality of sets of coefficients into partitions, depending on word counts
corresponding to each of the at least two language models".

Regarding claims 11 and 19, applicant argues that Gillick does not
disclose "each of the n-weights depend on n-gram history counts". Applicant
points out that the office action cites the value of R of Gillick as disclosing "each
of the n-weights depend on n-gram history counts" (column 16, lines 42-44),
with which the applicant disagrees. Although that particular section does not
specifically disclose that each of the n-weights depends on n-gram history
counts, Gillick does teach n-gram counts, i.e., the number of occurrences of the
given n-gram (word frequency). Therefore, the applicant's arguments have been
fully considered but they are not persuasive.
Claim Objections

3. Claims 3, 4 and 7 are objected to because of the following informalities:
• Claims 3, 4 and 7 depend on canceled claim 2. Therefore, the examiner
has interpreted those claims as depending on claim 1.
Appropriate correction is required.



Claim Rejections - 35 USC § 102 

4. The following is a quotation of the appropriate paragraphs of 35
U.S.C. 102 that form the basis for the rejections under this section made in this
Office action:

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 
122(b), by another filed in the United States before the invention by the applicant for patent or 
(2) a patent granted on an application for patent by another filed in the United States before 
the invention by the applicant for patent, except that an international application filed under 
the treaty defined in section 351(a) shall have the effects for purposes of this subsection of an 
application filed in the United States only if the international application designated the United 
States and was published under Article 21(2) of such treaty in the English language. 

5. Claims 1, 3, 5-13 and 15-24 are rejected under 35 U.S.C. 102(e) as being 
anticipated by Gillick et al. (6,167,377), hereinafter referenced as Gillick. 

Regarding claims 1, 12, 20 and 21, Gillick discloses an Automatic Speech
Recognition (ASR) system (figure 1, element 160; column 1, lines 6-7) having at
least two language models (variety of language models; column 2, lines 1-5), a
method for combining language model scores (column 16, lines 8-11) generated
by at least two language models, said method comprising the steps of:

generating a list (figure 11, element 1125) of most likely words for a
current word in a word sequence uttered by a speaker (column 1, lines 33-42),
and acoustic scores corresponding to the most likely words (figure 9);

computing language model scores for each of the most likely words in the 
list (column 18, lines 36-39), for each of the at least two language models; 

respectively and dynamically determining a set of coefficients to be used 
to combine the language model scores of each of the most likely words in the list, 
based on a context of the current word (column 17, lines 39-41);

respectively combining the language model scores of each of the most 
likely words in the list to obtain a composite score for each of the most likely 
words in the list, using the set of coefficients determined therefor (column 16, 
lines 8-11); 

wherein said determining step comprises the steps of: 
dividing text data (column 1, lines 8-13) for training (column 15, lines 7-13) 
a plurality of sets of coefficients into partitions (frames; column 1, lines 8-13), 
depending on word counts (identifying words/scores) corresponding to each of
the at least two language models (utterance/language models; column 15, lines
60-67 with column 16, lines 20-25); and

for each of the most likely words in the list, dynamically selecting (figure 
4A, element 405) the set of coefficients from among the plurality of sets of 
coefficients so as to maximize the likelihood (likelihood of the match; column 4, 
lines 1-20) of the text data with respect to the at least two language models 
(column 4, lines 46-67). 
Regarding claims 3, 13 and 22, Gillick discloses the method wherein the
at least two language models comprise a first and second language model, and
said dividing step comprises the step of grouping, in a same partition, word
triplets w.sub.1w.sub.2w.sub.3 (trigram models) which have a count for the word
pair w.sub.1w.sub.2 (bigram models; column 1, line 63 - column 2, line 5 and
pair of words; column 14, lines 17-32) in the first language model (first, second
and/or third language models) greater than the count for the word pair
w.sub.1w.sub.2 in the second language model (fourth language model; column
18, lines 16-28).
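
For illustration only, the grouping recited in claims 3, 13 and 22 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Gillick or from the application: the function name `partition_trigrams`, the partition labels, and the toy counts are all hypothetical. It places a word triplet in one partition when the count of its word pair w1 w2 in the first language model exceeds the count in the second, and in another partition otherwise.

```python
from collections import Counter, defaultdict


def partition_trigrams(trigrams, bigram_counts_lm1, bigram_counts_lm2):
    """Group trigrams by which model saw the history (w1, w2) more often.

    Hypothetical helper: the labels "lm1_dominant" and "other" are
    illustrative names, not terms from the cited reference.
    """
    partitions = defaultdict(list)
    for w1, w2, w3 in trigrams:
        c1 = bigram_counts_lm1.get((w1, w2), 0)
        c2 = bigram_counts_lm2.get((w1, w2), 0)
        key = "lm1_dominant" if c1 > c2 else "other"
        partitions[key].append((w1, w2, w3))
    return dict(partitions)


# Toy bigram counts for two language models (made-up numbers).
counts_lm1 = Counter({("the", "stock"): 12, ("a", "patient"): 1})
counts_lm2 = Counter({("the", "stock"): 3, ("a", "patient"): 9})

held_out = [("the", "stock", "market"), ("a", "patient", "chart")]
parts = partition_trigrams(held_out, counts_lm1, counts_lm2)
```

Each resulting partition could then be given its own set of coefficients; how those coefficients are trained is outside this sketch.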

Regarding claims 5 and 15, Gillick discloses the method further 
comprising the step of, for each of the most likely words in the list, combining an 
acoustic score (acoustic score) and the composite score (previous score) to 
identify a group of most likely words to be further processed (column 10, lines 8- 
14). 

Regarding claims 6, 16 and 23, Gillick discloses the method wherein the 
group of most likely words contains less words than the list of most likely words 
(added to the list of words; column 7, line 60 - column 8, line 32).

Regarding claim 7, Gillick discloses the method wherein the partitions are 
independent from the at least two language models (column 2, lines 1-5). 

Regarding claim 8, Gillick discloses the method further comprising the
step of representing the set of coefficients by a weight vector comprising
n-weights (interpolation weights), where n (lambda 1 and 2) equals a number of
language models in the system (column 16, lines 1-25), to identify what best
corresponds to a user's utterance.

Regarding claims 9, 17 and 24, Gillick discloses the method wherein said 
combining step comprises the steps of: 

for each of the most likely words in the list (column 1, lines 43-47),

multiplying a coefficient corresponding to a language model by a language
model score corresponding to the language model to obtain a product for each
of the at least two language models (column 10, lines 16-18); and

summing the product for each of the at least two language models
(column 10, lines 8-67), in order to determine the acoustic models that best
match the utterance.
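
For illustration only, the multiply-and-sum combining step recited in claims 9, 17 and 24 amounts to a weighted sum of per-model scores. The sketch below assumes one score per language model for each candidate word; the function name `composite_score`, the candidate words, and all numeric values are hypothetical and do not come from Gillick or from the application.

```python
def composite_score(lm_scores, weights):
    """Return the sum over models of coefficient * language model score."""
    if len(lm_scores) != len(weights):
        raise ValueError("need exactly one weight per language model")
    return sum(w * s for w, s in zip(weights, lm_scores))


# Made-up scores from two language models for two candidate words.
candidates = {"market": [0.20, 0.05], "mark": [0.10, 0.02]}
weights = [0.5, 0.5]  # assumed interpolation weights, one per model

composite = {word: composite_score(s, weights) for word, s in candidates.items()}
best = max(composite, key=composite.get)
```

With equal weights the composite score is simply the average of the two model scores, so the candidate favored by both models ranks first.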

Regarding claims 10 and 18, Gillick discloses the method wherein the 
text data for training the plurality of sets of coefficients is different than language 
model text data used to train the at least two language models (column 16, lines 
26-29). 

Regarding claim 11, Gillick discloses a method for combining language
model scores (column 16, lines 8-11) generated by at least two language models
(variety of language models; column 2, lines 1-5) comprised in an Automatic
Speech Recognition (ASR) system (figure 1, element 160; column 1, lines 6-7),
said method comprising the steps of:

generating a list (figure 11, element 1125) of most likely words for a
current word in a word sequence uttered by a speaker (column 1, lines 33-42),
and acoustic scores corresponding to the most likely words (figure 9);

computing language model scores for each of the most likely words in the
list (column 18, lines 36-39), for each of the at least two language models;

respectively and dynamically determining a weight vector to be used to 
combine the language model scores of each of the most likely words in the list 
based on the context of the current word (column 17, lines 39-41), the weight
vector comprising n-weights (interpolation weights), wherein n (lambda 1 and 2) 
equals a number of language models in the system (column 16, lines 1-25), and 
each of the n-weights depend upon n-gram history counts (frequency of words; 
column 14, lines 26-32); and 

respectively combining the language model scores of each of the most 
likely words in the list to obtain a composite score for each of the most likely 
words in the list, using the set of coefficients determined therefor (column 16, 
lines 8-11). 
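
For illustration only, one way a weight vector could "depend upon n-gram history counts" is to look it up from a small table keyed by how the history counts compare across models. This is a sketch under stated assumptions, not the method of Gillick or of the application: the function name `select_weights`, the three bucket labels, and the table values are hypothetical.

```python
def select_weights(history, counts_lm1, counts_lm2, weight_table):
    """Pick a weight vector based on the history's count in each model."""
    c1 = counts_lm1.get(history, 0)
    c2 = counts_lm2.get(history, 0)
    if c1 > c2:
        key = "lm1_dominant"
    elif c2 > c1:
        key = "lm2_dominant"
    else:
        key = "tie"
    return weight_table[key]


# Hypothetical table: one weight vector (n = 2 weights) per bucket.
weight_table = {
    "lm1_dominant": [0.8, 0.2],
    "lm2_dominant": [0.3, 0.7],
    "tie": [0.5, 0.5],
}
counts_lm1 = {("the", "stock"): 12}
counts_lm2 = {("the", "stock"): 3, ("a", "patient"): 9}

w_stock = select_weights(("the", "stock"), counts_lm1, counts_lm2, weight_table)
w_patient = select_weights(("a", "patient"), counts_lm1, counts_lm2, weight_table)
w_unseen = select_weights(("no", "such"), counts_lm1, counts_lm2, weight_table)
```

The selection happens per word context at recognition time, so different positions in the same utterance can receive different weight vectors.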

Regarding claim 19, Gillick discloses a combining system for combining
language model scores (column 16, lines 8-11) generated by at least two
language models (variety of language models; column 2, lines 1-5) comprised in
an Automatic Speech Recognition (ASR) system (figure 1, element 160; column
1, lines 6-7), the ASR system having a fast match (processor) for generating a
list (figure 11, element 1125) of most likely words for a current word in a word
sequence uttered by a speaker and acoustic scores corresponding to the most
likely words (column 1, lines 33-42), the combining system comprising:

a language model score computation device (hardware or software)
adapted to compute language model scores for each of the most likely words in 
the list (column 18, lines 36-39), for each of the at least two language models; 

a selection device (recognizer; figure 13, element 215) adapted to 
respectively and dynamically select a weight vector to be used to combine the 
language model scores of each of the most likely words in the list based on the 
context of the current word (column 17, lines 39-41), the weight vector 
comprising n-weights (interpolation weights), wherein n (lambda 1 and 2) equals 
a number of language models in the system (column 16, lines 1-25), and each of 
the n-weights depend upon n-gram history counts (frequency of words; column 
14, lines 26-32); and 

a combination device (select command; column 4, lines 46-50) adapted to
respectively combine the language model scores of each of the most likely
words in the list to obtain a composite score for each of the most likely words in 
the list, using the set of coefficients determined therefor (column 16, lines 8-11). 
Claim Rejections - 35 USC § 103

6. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for 
all obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described 
as set forth in section 102 of this title, if the differences between the subject matter sought to 
be patented and the prior art are such that the subject matter as a whole would have been 
obvious at the time the invention was made to a person having ordinary skill in the art to which 
said subject matter pertains. Patentability shall not be negatived by the manner in which the 
invention was made. 

7. Claims 4 and 14 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Gillick in view of Goldenthal (U.S. Patent No. 6,625,749), 
hereinafter referenced as Goldenthal. 

Regarding claims 4 and 14, Gillick discloses speech recognition language
models, but does not disclose the method wherein said selecting step comprises
the step of applying the Baum Welch iterative algorithm to the plurality of sets of
coefficients.

Goldenthal discloses the method wherein said selecting step comprises 
the step of applying the Baum Welch iterative algorithm to the plurality of sets of 
coefficients (column 2, lines 41-43), for training Hidden Markov Models (HMMs).

Therefore, it would have been obvious to one of ordinary skill in the art at 
the time the invention was made to modify Gillick's invention such that it applied 
the Baum Welch iterative algorithm, in order to handle speech problems (column 
2, lines 31-32). 
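
For illustration only, iterative re-estimation of combining coefficients can be sketched with a plain expectation-maximization (EM) loop; the Baum Welch algorithm is the form EM takes for Hidden Markov Models. The code below is a simplified stand-in and is not the procedure of Goldenthal, Gillick, or the application: the function name `em_weights`, the iteration count, and the toy probabilities are hypothetical.

```python
def em_weights(prob_rows, n_iter=20):
    """Re-estimate mixture weights on held-out data by EM.

    prob_rows[i][k] is the probability that model k assigns to the
    i-th held-out word (all values assumed to be positive).
    """
    n_models = len(prob_rows[0])
    weights = [1.0 / n_models] * n_models  # start from uniform weights
    for _ in range(n_iter):
        totals = [0.0] * n_models
        for row in prob_rows:
            mix = sum(w * p for w, p in zip(weights, row))
            for k in range(n_models):
                # Posterior share of model k for this word (E-step).
                totals[k] += weights[k] * row[k] / mix
        # Average the shares to obtain the new weights (M-step).
        weights = [t / len(prob_rows) for t in totals]
    return weights


# Made-up held-out probabilities; model 0 fits every word better.
rows = [[0.30, 0.05], [0.20, 0.10], [0.25, 0.02]]
weights = em_weights(rows)
```

Each pass cannot decrease the likelihood of the held-out words under the mixture, which is why the weights drift toward the model that explains that data better.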
Conclusion

8. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of
time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire 
THREE MONTHS from the mailing date of this action. In the event a first reply is 
filed within TWO MONTHS of the mailing date of this final action and the advisory 
action is not mailed until after the end of the THREE-MONTH shortened statutory 
period, then the shortened statutory period will expire on the date the advisory 
action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be
calculated from the mailing date of the advisory action. In no event, however, will 
the statutory period for reply expire later than SIX MONTHS from the mailing 
date of this final action. 

9. Any inquiry concerning this communication or earlier communications from 
the examiner should be directed to Jakieda R Jackson whose telephone number 
is 703.305.5593. The examiner can normally be reached on Monday through 
Friday from 7:30 a.m. to 5:00 p.m.

If attempts to reach the examiner by telephone are unsuccessful, the
examiner's supervisor, Doris To, can be reached on 703.305.4827. The fax
phone number for the organization where this application or proceeding is 
assigned is 703-872-9306. 
Information regarding the status of an application may be obtained from
the Patent Application Information Retrieval (PAIR) system. Status information
for published applications may be obtained from either Private PAIR or Public
PAIR. Status information for unpublished applications is available through
Private PAIR only. For more information about the PAIR system, see
http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

June 3, 2004



